# First, we loaded the packages that we had previously installed from the console:

library(sf)
## Linking to GEOS 3.8.1, GDAL 3.2.1, PROJ 7.2.1
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.3     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   2.0.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(ggspatial)
library(ggthemes)

We downloaded two datasets. The first, from the City of Cambridge’s GIS portal, gives the spatial location of the MBTA stops. The second, from the MBTA open data portal, reports ridership at each MBTA stop across various times of the day. We narrowed the scope of this project to the parts of the MBTA Red Line that run through Cambridge and to ridership during the morning commuter peak.

stations <- st_read("TRANS_SubwayStations-2")
## Reading layer `TRANS_SubwayStations' from data source 
##   `/Users/ariellerawlings/Desktop/Fall 2021/2128_Spatial Analysis/Assignment 1/2128-Assignment-1/TRANS_SubwayStations-2' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 9 features and 2 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 752981.6 ymin: 2956937 xmax: 772954.5 ymax: 2969832
## Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
ridership <- st_read("MBTA Ridership.csv")%>%
  mutate(STATION = toupper(stop_name))
## Reading layer `MBTA Ridership' from data source 
##   `/Users/ariellerawlings/Desktop/Fall 2021/2128_Spatial Analysis/Assignment 1/2128-Assignment-1/MBTA Ridership.csv' 
##   using driver `CSV'
## Warning: no simple feature geometries present: returning a data.frame or tbl_df
MBTA_ridership_AM_PEAK <- st_as_sf(left_join(ridership, stations, by = "STATION"))%>%
  filter(time_period_name == "AM_PEAK")

MBTA_ridership_AM_PEAK$average_flow <-
  as.numeric(as.character(MBTA_ridership_AM_PEAK$average_flow))
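
The join above relies on the uppercased stop names matching the STATION field exactly. The following is a minimal sketch, not part of the original analysis, of how unmatched names could be checked with `anti_join()` before joining. The toy tables, their values, and the names `ridership_toy`, `stations_toy`, and `unmatched` are all hypothetical stand-ins for the real files.

```r
library(dplyr)

# Toy stand-in for the ridership CSV; the values are made up for illustration.
ridership_toy <- tibble(
  stop_name = c("Harvard", "Central", "Kendall/MIT"),
  average_flow = c("1200", "950", "1100")  # stored as text, as when read from the CSV
) %>%
  mutate(STATION = toupper(stop_name))

# Toy stand-in for the stations layer (attributes only, no geometry).
stations_toy <- tibble(
  STATION = c("HARVARD", "CENTRAL", "KENDALL")
)

# Ridership rows whose STATION value has no exact match in the stations table.
unmatched <- anti_join(ridership_toy, stations_toy, by = "STATION")

# Convert the text column to numbers so it can drive point size in a plot.
ridership_toy <- ridership_toy %>%
  mutate(average_flow = as.numeric(average_flow))
```

In this toy data, `unmatched` holds the one row whose name differs between the two tables. An empty result would suggest that every ridership row found a station.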

Our first plot confirmed that we could plot the number of riders during the morning peak for each MBTA station. We tried to separate the “inbound” riders from the “outbound” riders, but we could not work out the code to do so. We will keep experimenting with the various functions to learn how to filter variables.

ggplot(MBTA_ridership_AM_PEAK)+
  geom_sf(aes(size = average_flow,
              color = stop_name),
          alpha = 0.25)
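
One possible way to separate riders by direction is to add a second condition to `filter()`. This is a sketch under assumptions: the column name `direction` and its values `"inbound"` and `"outbound"` are hypothetical, and the real ridership file may name or code this field differently. The toy values are made up.

```r
library(dplyr)

# Toy ridership table; `direction` and all values are hypothetical.
ridership_toy <- tibble(
  stop_name = c("Harvard", "Harvard", "Central", "Central"),
  time_period_name = rep("AM_PEAK", 4),
  direction = c("inbound", "outbound", "inbound", "outbound"),
  average_flow = c(1200, 300, 950, 410)
)

# Conditions separated by commas in filter() are combined with AND,
# so this keeps only the inbound rows from the morning peak.
inbound_am <- ridership_toy %>%
  filter(time_period_name == "AM_PEAK", direction == "inbound")
```

If both directions should appear side by side instead, adding `facet_wrap(~ direction)` to the plot would draw one panel per direction, again assuming such a column exists.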

We then downloaded GIS data to show the boundary of Cambridge. This serves as our polygon layer.

boundary <- st_read("BOUNDARY_CityBoundary.shp")
## Reading layer `BOUNDARY_CityBoundary' from data source 
##   `/Users/ariellerawlings/Desktop/Fall 2021/2128_Spatial Analysis/Assignment 1/2128-Assignment-1/BOUNDARY_CityBoundary.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 7 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 747914.2 ymin: 2953731 xmax: 773977.9 ymax: 2972374
## Projected CRS: NAD83 / Massachusetts Mainland (ftUS)

Our first map shows the Cambridge boundary, outlined in red, and the average flow of riders during the morning peak at each Red Line station.

ggplot(MBTA_ridership_AM_PEAK)+
  annotation_map_tile(zoomin = 0, progress = "none", type = "cartolight")+
  geom_sf(aes(size = average_flow,
              color = stop_name),
          alpha = 0.25)+
  labs(caption = "Map tiles and data by OpenStreetMap")+
  geom_sf(data = boundary,
          color = "red",
          size = 1,
          fill = NA)
## Loading required namespace: raster

We realized that it would be helpful to outline the path of the MBTA Red Line through Cambridge on our next map, so we downloaded GIS data that shows the path of the Red Line.

subway_line <- st_read("TRANS_SubwayLines")
## Reading layer `TRANS_SubwayLines' from data source 
##   `/Users/ariellerawlings/Desktop/Fall 2021/2128_Spatial Analysis/Assignment 1/2128-Assignment-1/TRANS_SubwayLines' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 3 features and 4 fields
## Geometry type: LINESTRING
## Dimension:     XY
## Bounding box:  xmin: 752981.6 ymin: 2956732 xmax: 775355.6 ymax: 2970215
## Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
subway_line_red <- st_as_sf(subway_line) %>%
  filter(LINE == "RED")

This next map includes the outline of the Red Line.

ggplot(MBTA_ridership_AM_PEAK)+
  annotation_map_tile(zoomin = 0, progress = "none", type = "cartolight")+
  geom_sf(aes(size = average_flow,
              color = stop_name),
          alpha = 0.5)+
  labs(caption = "Map tiles and data by OpenStreetMap")+
  geom_sf(data = boundary,
          color = "red",
          size = 1,
          fill = NA)+
  geom_sf(data = subway_line_red,
          color = "cadetblue3",
          size = 1,
          fill = NA)

We then experimented with color, introducing “firebrick4” and “darkslategray4.” We also switched from outlining the Cambridge boundary to filling it.

ggplot(MBTA_ridership_AM_PEAK)+
  annotation_map_tile(zoomin = 0, progress = "none", type = "cartolight")+
  geom_sf(aes(size = average_flow,
              color = stop_name),
          alpha = 0.5)+
  labs(caption = "Map tiles and data by OpenStreetMap")+
  geom_sf(data = boundary,
          color = NA,
          size = 1,
          fill = "firebrick4",
          alpha = 0.5)+
  geom_sf(data = subway_line_red,
          color = "darkslategray4",
          size = 1,
          fill = NA)

Next, we experimented with the base map. Here, the base map is “stamenbw” instead of the “cartolight” we had been using.

ggplot(MBTA_ridership_AM_PEAK)+
  annotation_map_tile(zoomin = 0, progress = "none", type = "stamenbw")+
  geom_sf(aes(size = average_flow,
              color = stop_name))+
  labs(caption = "Map tiles and data by OpenStreetMap")+
  geom_sf(data = boundary,
          color = NA,
          size = 1,
          fill = "firebrick4",
          alpha = 0.5)+
  geom_sf(data = subway_line_red,
          color = "darkslategray4",
          size = 1,
          fill = NA)

For our final map, we changed the aesthetics to emphasize the Red Line path, shown in bright white over a mostly black base map. We hope to learn how to manipulate the graphics with more expertise so that we can more easily highlight the pieces of data that we consider most critical.

ggplot(MBTA_ridership_AM_PEAK)+
  annotation_map_tile(zoomin = 0, progress = "none", type = "cartodark")+
  geom_sf(aes(size = average_flow,
              color = stop_name))+
  labs(caption = "Map tiles and data by OpenStreetMap")+
  geom_sf(data = boundary,
          color = NA,
          size = 1,
          fill = "white",
          alpha = 0.1)+
  geom_sf(data = subway_line_red,
          color = "white",
          size = 1,
          fill = NA)
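
The legends in these maps take their titles from the column names `average_flow` and `stop_name`. A minimal sketch of how the titles could be replaced with `labs()` follows. It uses a plain scatter plot on made-up toy data rather than the actual map layers, and the label text is only an illustration.

```r
library(ggplot2)

# Toy data; coordinates and ridership values are made up.
toy <- data.frame(
  x = c(1, 2, 3),
  y = c(1, 2, 1),
  stop_name = c("Harvard", "Central", "Kendall/MIT"),
  average_flow = c(1200, 950, 1100)
)

# labs() sets a legend title for each aesthetic that is named in aes().
p <- ggplot(toy) +
  geom_point(aes(x = x, y = y, size = average_flow, color = stop_name)) +
  labs(size = "Average riders (AM peak)", color = "Station")
```

The same `labs()` call could be added to a `geom_sf()` map, since it only refers to the aesthetics and not to the geometry type.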

Contribution statements:

Arielle: I had a good grasp of the very basics of R (loading data, installing packages, and other basic commands) coming into this class, but I quickly ran into many roadblocks with this assignment that shook the confidence I had going into it. I struggled to download one of the primary datasets we chose for this project and spent a lot of time troubleshooting. While I was working through that, I relied heavily on Lei, who was able to download the data, and she and I worked on some of the initial code to merge the datasets and plot them on her computer. Once we collectively figured out how to render maps with our data, I spent time on the final iterations of the various maps and worked collaboratively with Lei to think through different ways to visualize commuter flows through the Cambridge Red Line stations. At the end of our project, I took the lead on familiarizing myself with GitHub. Though I ran into trouble authenticating GitHub in RStudio and spent a good amount of time working to fix it, I eventually worked through it and was able to upload our RMarkdown file.

Taylor: I personally had a lot of trouble keeping up with my group because the data we were using was a bit more complex than what was assigned. However, I was able to figure out how to connect my RStudio to GitHub (Arielle and I are both Mac users and were running into some authentication issues). I also tried to be as supportive as possible and follow along with my group members as they worked through the different coding roadblocks they were running into.

Lei: For this project, I was interested in commuting time and was pointed to the MBTA Open Data Portal by our TA Jonathan. There, our group found the ridership file, which we realized did not have spatial data. I tried manually adding the spatial information to the data file but failed. However, after we consulted Megan and Carole together as a group, the “mutate” function worked, and we all learned a new skill. I shared preliminary ideas on plotting once the data was in order, and worked collaboratively with Arielle on iterations. We wanted to try out more functions, such as filtering and changing the names of variables in the plot and legend, and although not all of our attempts were successful, we intend to continue learning. In other instances, I helped our teammate Taylor familiarize herself with R after I had finished following the tutorial and she encountered some issues quite early on.

Arielle: 10 points Lei: 10 points Taylor: 10 points